Design a data structure that simulates an in-memory file system.
Implement the FileSystem class:
FileSystem()Initializes the object of the system.List<String> ls(String path)- If
pathis a file path, returns a list that only contains this file's name. - If
pathis a directory path, returns the list of file and directory names in this directory.
- If
void mkdir(String path)Makes a new directory according to the givenpath. The given directory path does not exist. If the middle directories in the path do not exist, you should create them as well.void addContentToFile(String filePath, String content)- If
filePathdoes not exist, creates that file containing givencontent. - If
filePathalready exists, appends the givencontentto original content.
- If
String readContentFromFile(String filePath)Returns the content in the file atfilePath.
Example 1:
Input
["FileSystem", "ls", "mkdir", "addContentToFile", "ls", "readContentFromFile"]
[[], ["/"], ["/a/b/c"], ["/a/b/c/d", "hello"], ["/"], ["/a/b/c/d"]]
Output
[null, [], null, null, ["a"], "hello"]
Explanation
FileSystem fileSystem = new FileSystem();
fileSystem.ls("/"); // return []
fileSystem.mkdir("/a/b/c");
fileSystem.addContentToFile("/a/b/c/d", "hello");
fileSystem.ls("/"); // return ["a"]
fileSystem.readContentFromFile("/a/b/c/d"); // return "hello"
Constraints:
1 <= path.length, filePath.length <= 100pathandfilePathare absolute paths which begin with'/'and do not end with'/'except that the path is just"/".- You can assume that all directory names and file names only contain lowercase letters, and the same names will not exist in the same directory.
- You can assume that all operations will be passed valid parameters, and users will not attempt to retrieve file content or list a directory or file that does not exist.
1 <= content.length <= 50- At most
300calls will be made tols,mkdir,addContentToFile, andreadContentFromFile.
Average Rating: 4.95 (60 votes)
Solution
Approach #1 Using separate Directory and File List[Accepted]
We start our discussion by looking at the directory structure used. The root directory acts as the base of the directory structure. Each directory contains two hashmaps namely dirs and files. The dirs contains data in the form [(subdirectory1_name:subdirectory1_structure),(subdirectory2_name:subdirectory2_structure)...]. The files contains data in the form [(file1:file1_contents),(file2:file2_contents)...]. This directory structure is shown below with a sample showing just the first two levels.
Now, we'll discuss how we implement the various commands required.
-
ls: In this case, we start off by initializing t, a temporary directory pointer, to the root directory. We split the input directory path based on/and obtain the individual levels of directory names in a d array. Then, we traverse over the tree directory structure based on the individual directories found and we keep on updating the t directory pointer to point to the new level of directory(child) as we go on entering deeper into the directory structure. At the end, we will stop at either the end level directory or at the file name depending upon the input given. If the last level in the input happens to be a file name, we simply need to return the file name. So, we directly return the last entry in the d array. If the last level entry happens to be a directory, we can obtain its subdirectory list from the list of keys in its dirs hashmap. Similarly, we can obtain the list of files in the last directory from the keys in the corresponding files hashmap. We append the two lists obtained, sort them and return the sorted appended list. -
mkdir: In response to this command, as in case ofls, we start entering the directory structure level by level. Whenever we reach a state where a directory mentioned in the path ofmkdirdoesn't exist, we create a new entry in the last valid directory's dirs structure and initialize its subdirectory list as an empty list. We keep on doing so till we reach the end level directory. -
addContentToFile: In response to this command as well, as in case ofls, we start entering the directory structure level by level. When we reach the level of the file name, we check if the file name already exists in the files keys. If it exists, we concatenate the current contents to the contents of the file(in the value section of the corresponding file). If it doesn't exist, we create a new entry in the current directory's files and initialize its contents with the current contents. -
readContentFromFile: As the previous cases, we reach the last directory level by traversing through the directory structure level by level. Then, in the last directory, we search for the file name entry in the corresponding files' keys and return its corresponding value as the contents of the file.
Performance Analysis
-
The time complexity of executing an
lscommand is O(m+n+klog(k)). Here, m refers to the length of the input string. We need to scan the input string once to split it and determine the various levels. n refers to the depth of the last directory level in the given input forls. This factor is taken because we need to enter n levels of the tree structure to reach the last level. k refers to the number of entries(files+subdirectories) in the last level directory(in the current input). We need to sort these names giving a factor of klog(k). -
The time complexity of executing an
mkdircommand is O(m+n). Here, m refers to the length of the input string. We need to scan the input string once to split it and determine the various levels. n refers to the depth of the last directory level in themkdirinput. This factor is taken because we need to enter n levels of the tree structure to reach the last level. -
The time complexity of both
addContentToFileandreadContentFromFileis O(m+n). Here, m refers to the length of the input string. We need to scan the input string once to split it and determine the various levels. n refers to the depth of the file name in the current input. This factor is taken because we need to enter n levels of the tree structure to reach the level where the files's contents need to be added/read from. -
The advantage of this scheme of maintaining the directory structure is that it is expandable to include even more commands easily. For example,
rmdirto remove a directory given an input directory path. We need to simply reach to the destined directory level and remove the corresponding directory entry from the corresponding dirs keys. -
Renaming files/directories is also very simple, since all we need to do is to create a temporary copy of the directory structure/file with a new name and delete the last entry.
-
Relocating a hierarchichal subdirectory structure from one directory to the other is also very easy, since, all we need to do is obtain the address for the corresponding subdirectory class, and assign the same at the new positon in the new directory structure.
-
Extracting only directories or files list on any path is easy in this case, since we maintain separate entires for dirs and files.
Approach #2 Using unified Directory and File List[Accepted]
This design differs from the first design in that the current data structure for a Directory contains a unified files hashmap, which contains the list of all the files and subdirectories in the current directory. Apart from this, we contain an entry isfile, which when True indicates that the current files entry is actually corresponding to a file, otherwise it represents a directory. Further, since we are considering the directory and files' entries in the same manner, we need an entry for content, which contains the contents of the current file(if isfile entry is True in the current case). For entries corresponding to directories, the content field is kept empty.
The following figure shows the directory structure for the same example as in the case above, for the first two levels of the hierarchical structure.
The implementation of all the commands remains the same as in the last design, except that we need to make entries in the same files hashmap for both files and directories, corresponding to addContentToFile and mkdir respectively. Further, for ls, we need not extract entries separately for the files and directories, since they are unified in the current case, and can be obtained in a single go.
This approach is inspired by @shawngao
Performance Analysis
-
The time complexity of executing an
lscommand is O(m+n+klog(k)). Here, m refers to the length of the input string. We need to scan the input string once to split it and determine the various levels. n refers to the depth of the last directory level in the given input forls. This factor is taken because we need to enter n levels of the tree structure to reach the last level. k refers to the number of entries(files+subdirectories) in the last level directory(in the current input). We need to sort these names giving a factor of klog(k). -
The time complexity of executing an
mkdircommand is O(m+n). Here, m refers to the length of the input string. We need to scan the input string once to split it and determine the various levels. n refers to the depth of the last directory level in themkdirinput. This factor is taken because we need to enter n levels of the tree structure to reach the last level. -
The time complexity of both
addContentToFileandreadContentFromFileis O(m+n). Here, m refers to the length of the input string. We need to scan the input string once to split it and determine the various levels. n refers to the depth of the file name in the current input. This factor is taken because we need to enter n levels of the tree structure to reach the level where the files's contents need to be added/read from. -
The advantage of this scheme of maintaining the directory structure is that it is expandable to include even more commands easily. For example,
rmdirto remove a directory given an input directory path. We need to simply reach to the destined directory level and remove the corresponding directory entry from the corresponding dirs keys. -
Renaming files/directories is also very simple, since all we need to do is to create a temporary copy of the directory structure/file with a new name and delete the last entry.
-
Relocating a hierarchichal subdirectory structure from one directory to the other is also very easy, since, all we need to do is obtain the address for the corresponding subdirectory class, and assign the same at the new positon in the new directory structure.
-
If the number of directories is very large, we waste redundant space for isfile and content, which wasn't needed in the first design.
-
A problem with the current design could occur if we want to list only the directories(and not the files), on any given path. In this case, we need to traverse over the whole contents of the current directory, check for each entry, whether it is a file or a directory, and then extract the required data.
TreeMap can be used instead of HashMap to maintain lexicographic ordering
December 6, 2018 1:01 AM
@vinod23 Thanks for sharing. It's a very detailed and clear article.
However, there is one thing I'd like to discuss. In the performance analysis, I think the 'n' part could be removed. I want to claim that n = O(m), so O(m+n) = O(m). This can be easily proven as following, if the length of the path string is m, then the max levels of hierarchies (intermediate directories) it can contain is at most m/2, where each directory is a single character. Thus, the levels we must enter is n = m/2 = O(m).
May 25, 2020 6:28 AM
The variable naming of the first solution is really bad.
August 17, 2020 10:27 AM
I dont think its a hard level. Frankly its not complex. One may not come up with a solution, but complexity level of the solution is not hard.
November 27, 2020 3:41 AM
who the hell can finish this in an hour interview?
The test 59 of 63 gives the wrong result, even if entering the test case manually it actually pass.
["FileSystem","ls","mkdir","addContentToFile","ls","readContentFromFile","addContentToFile","readContentFromFile","addContentToFile","ls","readContentFromFile"]
[[],["/"],["/a/b/c"],["/a/b/c/d","hello world"],["/"],["/a/b/c/d"],["/a/b/c/d"," hello hello world"],["/a/b/c/d"],["/a/b/c/dddd","winstons"],["/a/b/c"],["/a/b/c/dddd"]]
The right output should be :
[null,[],null,null,["a"],"hello world",null,"hello world hello hello world",null,["d","dddd"],"winstons"]
Instead the test says it expect:
[null,[],null,null,["a"],"hello world",null,"hello world hello hello world",null,["d","dddd"],""]
Which is actually wrong.
August 4, 2019 4:30 AM
the first diagram is wrong which dir 'a' and 'b' are parent and sub relationship rather than on the same level.
February 18, 2018 6:15 AM
Great article. Cleared a lot of doubts.
Last Edit: February 8, 2021 3:20 AM
I have a question about the klogk part of the time complexity (O(m+n+klog(k))) for the ls command. Quoting the solution, "k refers to the number of entries(files+subdirectories) in the last level directory(in the current input). We need to sort these names giving a factor of klog(k)." I agree that since there are k entries, there are klogk comparisons. But each of those entries is a string, say, of length s. Shouldn't we, therefore, factor the O(s) time it takes to compare each string? If yes, the time complexity would be O(m+n+ s * k(logk))
You don't have any submissions yet.
xxxxxxxxxxclass FileSystem {public: FileSystem() { } vector<string> ls(string path) { } void mkdir(string path) { } void addContentToFile(string filePath, string content) { } string readContentFromFile(string filePath) { }};/** * Your FileSystem object will be instantiated and called as such: * FileSystem* obj = new FileSystem(); * vector<string> param_1 = obj->ls(path); * obj->mkdir(path); * obj->addContentToFile(filePath,content); * string param_4 = obj->readContentFromFile(filePath); */